Abstract

If an observer is moving rigidly with bounded rotation, then normal flow measurements (i.e., the spatiotemporal derivatives of the image intensity function) give rise to a constraint on the observer's translation. This novel constraint gives rise to a robust, qualitative solution to the problem of recovering the observer's heading direction, by providing an area where the Focus of Expansion lies. If the rotation of the observer is large then the solution area is large too, while small rotation causes the solution area to be small, thus giving rise to a robust solution. In the paper the relationship between the solution area and the rotation and translation vectors is studied, and experimental results using synthetic and real calibrated image sequences are presented. This work demonstrates that the algorithm developed in (Horn and Weldon 1987) for the case of pure translation, if appropriately modified, results in a robust algorithm that works in the case of general rigid motion with bounded rotation. Consequently, it has the potential to replace expensive accelerometers, inertial systems and inaccurate odometers in practical navigational systems for the problem of kinetic stabilization, which is a prerequisite for any other navigational ability.


1  Introduction 

The problem of passive navigation has attracted a lot of attention in the past ten years (Bruss and Horn 1983; Longuet-Higgins 1981; Longuet-Higgins and Prazdny 1980; Spetsakis and Aloimonos 1988; Tsai and Huang 1984; Ullman 1979) because of the generality of a potential solution. The problem has been formulated as follows: given a sequence of images taken by a monocular observer undergoing unrestricted rigid motion in a stationary environment, recover the 3-D motion of the observer. In particular, if $(U, V, W)$ and $(\omega_x, \omega_y, \omega_z)$ are the translation and rotation, respectively, comprising the general rigid motion of the observer, the problem is to recover the following five numbers: the direction of translation $(U/W, V/W)$ and the rotation $(\omega_x, \omega_y, \omega_z)$. (See Figure 1 for a pictorial description of the geometric model of the observer; O is the nodal point of the eye.)

The problem has thus been formulated as the general 3-D motion estimation problem (kinetic depth or structure from motion), and its solution would solve a series of problems (for example target pursuit, visual rendezvous, etc.) as simple applications. In this paper we study the problem of passive navigation in the framework of purposive vision (Aloimonos 1990a; Aloimonos 1992). Our basic thesis is that we must seek a robust solution for the problem under consideration only. If our proposed solution for the passive navigation problem also solves the problem of determining the 3-D motion of an object moving in the field of view of a static observer, then we have solved a more general problem than the one we initially considered. In addition, the technique has qualitative characteristics. For examples of qualitative approaches to visual motion problems, see Burger and Bhanu (1990), Francois and Bouthemy (1990), Thompson and Kearney (1986), Thompson and Painter (1992), Tistarelli and Sandini (1992), Weinshall (1990), Weinshall (1991), Zisserman and Cipolla (1990), and Fermüller (1993a).

2  Previous  Work 

Previous research can be classified into two broad categories: methods based on optic flow or correspondence, and direct methods.

In the first category, whose uniqueness properties are well understood (Faugeras and Maybank 1990), under the assumption that optic flow or correspondence is known with some uncertainty, finding the best solution results in a non-linear optimization problem. One develops an error measure (usually a function of the input error) that is minimized in some way. Treating the problem as one of statistical estimation has given rise lately to very sophisticated approaches. Although such research on general recovery is making tremendous progress, the existing general recovery results cannot yet survive in the real world, because small amounts of error in the input can produce very large errors in the output (Spetsakis and Aloimonos 1988; Horn 1990; Young and Chellappa 1988; Weng, Huang, and Ahuja 1987). It is true that if a human operator established the feature correspondences between successive image frames, most of these algorithms would give practical results. It is nevertheless highly questionable that these algorithms could be used in a real-time navigational system, when an average of 1% input noise is enough to create an error of 100% in the output, and especially when the problem of computing optic flow or displacements (correspondence) is ill-posed and any algorithm for computing them must rely on assumptions about the world that might not always be valid. There is no doubt that research on the topic will continue and will shed more light on the difficulties associated with the general problem of 3-D motion computation.

In the second category, direct methods attempt to recover 3-D motion using as input the spatiotemporal derivatives of the image intensity function, thus avoiding the correspondence problem. These techniques were pioneered in (Aloimonos and Brown 1984) for the case of pure rotation and developed much further by Horn and his associates (Horn and Weldon 1987; Negahdaripour 1986; White and Weldon 1987) for the case of translation only. Recently Fermüller addressed the general case (unrestricted rigid motion) (Fermüller 1993b) by discovering geometric constraints on the normal flow signs that take the form of global patterns in the image plane. Here, we treat the general problem but for the case where the rotation is bounded. When this paper was under review it came to our attention that the same result was developed independently in (Blake, Murray, and Sinclair 1992).


3  Kinetic  Stabilization 

Consider a monocular observer as in Figure 2. We assume that the observer moves only forward (see Figure 3). It is assumed that the observer is equipped with inertial sensors which provide the rotation $(\omega_x, \omega_y, \omega_z)$ of the observer at any time. As the observer moves in its environment, normal flow fields are computed in real time. Since optic flow due to rotation does not depend on depth but only on image position $(x, y)$, we know (and can compute in real time) its value $(u^R, v^R)$ at every image point along with the normal flow. That means that we know the normal flow due to translation (see Figure 3a). In other words, since we can derotate, we assume that the normal flow is due to translation only. In later sections we analyze the case where rotation is present. When the observer moves forward in a static scene, it is approaching anything visible in the scene and the flow is expanding. From Figure 3b, it is clear that the focus of expansion (FOE) $= (U/W, V/W)$ (when the gradient space of directions is superimposed on the image space) lies in the half plane defined by line $(\varepsilon)$; thus every point in that half plane receives one vote for being the FOE. Clearly, at every point we obtain a constraint line which constrains the FOE to lie in a half plane. If the FOE lies on the image plane (i.e. the direction of translation is anywhere in the solid sector OABCD (Figure 4)), then the FOE is constrained to lie in an area on the image plane and thus it can be localized (see Figure 5). When the FOE does not lie inside the image, a closed area cannot be found, but the votes collected by the half planes indicate its general direction. By making a "saccade", i.e. a rotation of the camera, the observer can then bring the FOE inside the image and localize it (Figure 6 explains the process).

Fig. 2.

An algebraic way to derive the same constraint (Horn and Weldon 1987) is as follows. If $f(x, y, t)$ is the image intensity function, then we have $f_x u + f_y v + f_t = 0$, where $(u, v)$ is the flow. If we only have translation (or we know the rotation), then we get

$$f_x \frac{-U + xW}{Z} + f_y \frac{-V + yW}{Z} + f_t = 0$$

or

$$f_x \frac{W}{Z}\left(x - \frac{U}{W}\right) + f_y \frac{W}{Z}\left(y - \frac{V}{W}\right) + f_t = 0,$$

and if $W/Z > 0$,

$$\left( f_x \left(x - \frac{U}{W}\right) + f_y \left(y - \frac{V}{W}\right) \right) / f_t < 0.$$

This linear inequality in $(U/W, V/W)$ (i.e. the FOE) constrains the FOE to lie on one side of the line normal to $(f_x, f_y)$. The contribution of this paper is to show that this simple constraint intersection technique, when appropriately modified, works even in the presence of rotation.
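The half-plane voting implied by this inequality can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names (`vote_foe`, `synthetic_measurements`), the candidate grid, and the synthetic data generator are all hypothetical, and the measurements are noise-free and purely translational.

```python
import numpy as np

def vote_foe(points, gradients, f_t, grid_x, grid_y):
    """Count, for each candidate FOE on a grid, how many half-plane constraints it satisfies."""
    cx, cy = np.meshgrid(grid_x, grid_y)          # candidate FOE positions, shape (ny, nx)
    votes = np.zeros(cx.shape, dtype=int)
    for (x, y), (fx, fy), ft in zip(points, gradients, f_t):
        # Constraint for W/Z > 0: (f_x (x - x0) + f_y (y - y0)) / f_t < 0
        lhs = (fx * (x - cx) + fy * (y - cy)) / ft
        votes += lhs < 0
    return votes

def synthetic_measurements(foe, n, rng):
    """Noise-free pure translation (assumed): flow = (W/Z) * (p - foe), with W/Z > 0."""
    pts = rng.uniform(-1.0, 1.0, size=(n, 2))
    w_over_z = rng.uniform(0.5, 2.0, size=n)
    flow = w_over_z[:, None] * (pts - np.asarray(foe))
    ang = rng.uniform(0.0, 2.0 * np.pi, size=n)
    grads = np.stack([np.cos(ang), np.sin(ang)], axis=1)   # gradient directions (f_x, f_y)
    f_t = -np.sum(grads * flow, axis=1)                    # from f_x u + f_y v + f_t = 0
    keep = np.abs(f_t) > 1e-6                              # drop degenerate measurements
    return pts[keep], grads[keep], f_t[keep]
```

With such data, the true FOE satisfies every constraint, so it collects one vote per measurement, and the cells tied for the maximum form a small region around it.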

4  The  Algorithm 

Fig. 3. Given the normal flow $u^n$ and the rotational flow $u^R$ at a point $O(x, y)$, and given that the projection of the sum $u^R + u^t$ on $\vec{n}$ should equal $u^n$, we conclude that the translational flow is OD, where D is anywhere on $(\varepsilon')$. Clearly, in such a case, the focus of expansion lies on the half plane defined by $(\varepsilon)$ that does not contain $u^t$. This statement is equivalent to the following algebraic inequality (Horn and Weldon 1987). If $f(x, y, t)$ is the image intensity function, then we have $f_x u + f_y v + f_t = 0$, where $(u, v)$ is the flow. If we only have translation (or we know the rotation), then we get $f_x \frac{-U + xW}{Z} + f_y \frac{-V + yW}{Z} + f_t = 0$ or $f_x \frac{W}{Z}(x - \frac{U}{W}) + f_y \frac{W}{Z}(y - \frac{V}{W}) + f_t = 0$, and if $W/Z > 0$, $(f_x (x - \frac{U}{W}) + f_y (y - \frac{V}{W})) / f_t < 0$. However, thinking in terms of normal flow due to translation as in Figure 3b, the FOE must lie in the half plane (dotted line) of $(\varepsilon)$. But this assumes that the flow $u$ can be arbitrarily large, which is absurd. If there is a bound on the flow, then the FOE is constrained further (Figure 3c).

We assume that the computation of the normal flow, the voting, and the localization of the area containing the highest number of votes can be done in real time. In this paper we do not address real-time implementation issues, as we wish to analyze the theoretical aspects of the technique. However, it is quite clear that computation of normal flow can be done in real time (there already exist chips performing edge detection). According to the literature on connectionist networks (Ballard 1984), voting can also be done in real time. Let S denote the area with the highest number of votes. Let L(S) be a Boolean function that is true when the intersection of S with the image boundary is the null set, and false otherwise. Then, the following algorithm finds the area S, i.e., solves the passive navigation problem. We assume that the inertial sensors provide the rotation and thus we know the normal flow due to translation.

1. begin {
2.   find area S
3.   repeat until L(S)
4.   { rotate camera around x, y axes so that the optical axis passes through the center of S (saccade)
5.     find area S
     }
6.   output S
   }

If  the  camera  has  a  wide  angle  lens,  then  image 
points  can  represent  many  orientations,  and  only  one 
saccade  (or  none)  may  be  necessary.  But  if  we  have 
a  small  angle  lens,  then  we  may  have  to  make  more 
than  one  saccade. 
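The control flow of the algorithm above can be sketched as follows. This is only an illustration under stated assumptions: the vote array is taken to be a 2-D grid over the image, and `find_votes` and `rotate_camera` are hypothetical callbacks standing in for the voting stage and the camera controller, neither of which is specified here.

```python
import numpy as np

def max_vote_area(votes):
    """Boolean mask of the cells holding the maximum number of votes (area S)."""
    return votes == votes.max()

def is_closed(area):
    """L(S): True when S does not intersect the image boundary."""
    border = np.zeros(area.shape, dtype=bool)
    border[0, :] = border[-1, :] = True
    border[:, 0] = border[:, -1] = True
    return not np.any(area & border)

def localize(find_votes, rotate_camera, max_saccades=10):
    """Repeat 'find S, saccade toward its center' until L(S) holds.

    find_votes() returns the current vote array; rotate_camera(row, col)
    re-centers the view on the given image position. Both are hypothetical.
    max_saccades is a safety bound added for this sketch.
    """
    area = max_vote_area(find_votes())
    for _ in range(max_saccades):
        if is_closed(area):
            break
        rows, cols = np.nonzero(area)
        rotate_camera(rows.mean(), cols.mean())   # saccade toward the center of S
        area = max_vote_area(find_votes())
    return area
```

A narrow field of view corresponds to a small vote array relative to the distance to the FOE, in which case the loop needs several iterations before S lies entirely inside the image.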


Fig. 4. Consider the camera coordinate system. If the translation vector (U, V, W) is anywhere inside the solid OABCD defined by the nodal point of the eye and the boundaries of the image, then the FOE is somewhere on the image.

5  Analysis  of  the  Method 

We  have  assumed  that  the  inertial  sensors  will  provide 
the  observer  with  accurate  information  about  rotation. 
Fig. 5. (a) From a measurement $u^n$ of the normal flow due to translation at a point $(x, y)$ of the image, every point of the image belonging to the half plane defined by $(\varepsilon)$ that does not contain $u^n$ is a candidate for the position of the focus of expansion, and collects one vote. The voting is done in parallel for every image measurement. (b) If the FOE lies within the image boundaries, then the area containing the highest number of votes is the area containing the FOE. Using only a few measurements can result in a large area. Using many measurements (all possible) results in a small area (in our experiments a small area means a few pixels, usually at most three or four).


Although expensive accelerometers can achieve very high accuracy, the same is not true for inexpensive inertial sensors, and so we are bound to have some error. Thus we must assume that some unknown rotational part still exists and contributes to the value of the normal flow. As a result, the method for finding the FOE (previous section), which is based on translational normal flow information (since we have "derotated"), might be affected by the presence of some rotational flow. In this section, we study the effect of rotation (the error of the inertial sensor) on the technique for finding the FOE.

In order to avoid artificial problems introduced by perspective distortions in the case of a planar retina and to simplify the formulas without loss of generality, we employ a spherical retina. Let a sphere with radius $f$ and center O (Figure 7) represent the spherical retina (with O the nodal point of the eye) and a coordinate system OXYZ attached to it.

Let $\vec{r}_w = (X, Y, Z)$ be a world point and $\vec{r} = (x, y, z)$ be its image on the image sphere. Then

$$\vec{r} = \frac{f\,\vec{r}_w}{R}, \qquad R = \|\vec{r}_w\| = \sqrt{\vec{r}_w \cdot \vec{r}_w}.$$


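It can easily be shown (Koenderink and van Doorn 1975; Maybank 1985) that the flow at $\vec{r}$ is

$$\vec{u} = \frac{1}{R}\left(-\vec{t} f + \frac{1}{f}\,\vec{r}\,(\vec{t} \cdot \vec{r})\right) - \vec{\omega} \times \vec{r}.$$

Thus, the translational flow is

$$\vec{u}_t = \frac{1}{R}\left(-\vec{t} f + \frac{\vec{r}\,(\vec{t} \cdot \vec{r})}{f}\right)$$

while the rotational flow is given by

$$\vec{u}_R = -\vec{\omega} \times \vec{r}. \qquad (1)$$

Without loss of generality we can set $f = 1$.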
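As a quick numerical companion to the flow equations, the sketch below evaluates the two components for $f = 1$. It is a minimal illustration; the function name `spherical_flow` and its argument order are assumptions made here, not notation from the paper.

```python
import numpy as np

def spherical_flow(r, R, t, omega):
    """Flow components on the unit sphere (f = 1).

    r     : unit vector, image point on the sphere
    R     : distance of the world point from the nodal point
    t     : translation vector
    omega : rotation vector
    Returns (u_t, u_R) with u_t = (1/R)(-t + r (t.r)) and u_R = -omega x r.
    """
    r = np.asarray(r, dtype=float)
    t = np.asarray(t, dtype=float)
    omega = np.asarray(omega, dtype=float)
    u_t = (-t + r * np.dot(t, r)) / R      # depends on depth through R
    u_r = -np.cross(omega, r)              # independent of depth
    return u_t, u_r
```

Both components are tangent to the sphere at $\vec{r}$, the translational part vanishes at the FOE direction $\vec{t}/\|\vec{t}\|$, and the rotational part never exceeds $\|\vec{\omega}\|$ in magnitude, a bound that the later thresholding argument relies on.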

At this point we define two quantities that will be of use later. They are $\tau = R / \|\vec{t}\|$, which is related to the time to collision, and $k = \|\vec{\omega}\| R / \|\vec{t}\|$, which represents the effective ratio of rotation and translation.

The geometry of the spherical projection is then given in Figure 8. It has been shown (Nelson and Aloimonos 1988) that a full (360°) visual field simplifies motion analysis. However, what we usually have is just a piece of the surface of the sphere (due to a limited field of view). Consider then that the image (the part that we see) is projected on the surface patch S. Obviously, voting for the estimation of the FOE can be performed for all points on S.

Fig. 6. (a) If the area containing the highest number of votes has a piece of the image boundary as part of its boundary, then the FOE is outside the image plane (see 6b). (b) The position of the area containing the highest number of votes indicates the general direction in which the translation vector lies. (c) The camera ("eye") rotates so that the area containing the highest number of votes becomes centered. With a rotation around the x and y axes only, the optical axis can be positioned anywhere in space. The process stops when the highest vote area is entirely inside the image.


5.1  Principles  of  Voting 
Consider

$\vec{r}_i = (x, y, z)$, a point in S,

$\vec{n}_i = (n_x, n_y, n_z)$, the image gradient direction at point $\vec{r}_i$,

$\vec{u}_i = (u_x, u_y, u_z)$, the flow at point $\vec{r}_i$, and

$\vec{u}_i^n = (\vec{n}_i \cdot \vec{u}_i)\,\vec{n}_i$, the normal flow at $\vec{r}_i$.

Then (see Figure 9) if $\vec{r} = (x, y, z)$ is a point in S, a feature point $\vec{r}_i$ will vote for $\vec{r}$ being the FOE (direction of translation) iff $\vec{u}_i^n \cdot (\vec{r} - \vec{r}_i) < 0$.

Fig. 7.

Fig. 8.

Fig. 9.


If $V[\vec{r}]$ represents the number of votes collected at point $\vec{r}$, then it is easy to see that

$$V[\vec{r}] = \sum_{\vec{r}_i \in S} U\left[\vec{u}_i^n \cdot (\vec{r}_i - \vec{r})\right]$$

where

$$U(x) = \begin{cases} 1, & x > 0 \\ 0, & x < 0 \end{cases}$$

Let $S' = \{\vec{r} \mid \forall \vec{r}' \in S,\ V[\vec{r}] \ge V[\vec{r}']\}$ be the set of points that have acquired the maximum number of votes. There are two cases:

Case 1: S' does not intersect the border of S, in which case the FOE is in S'.

Case 2: S' touches the border of S, in which case the FOE could be outside of S.

It  should  be  clear  that  if  there  is  no  rotation,  then  S' 
will  always  contain  the  FOE  or  give  the  direction  of 
the  FOE — i.e.  the  direction  towards  which  we  need 
to  rotate.  The  size  of  S'  depends  on  the  distribution 
of  features. 
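The voting rule of this section can be sketched directly on the unit sphere. The code below is an illustration under stated assumptions: features are given as arrays of unit image points and unit gradient directions, the depths are known only to the data generator, and the helper names (`unit`, `normal_flows`, `votes`) are hypothetical.

```python
import numpy as np

def unit(v):
    """Normalize vectors along the last axis."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def normal_flows(r, n, t, omega, R):
    """Normal flow vectors u^n = (n.u) n, with u = (1/R)(-t + r (t.r)) - omega x r (f = 1).

    r, n : (N, 3) unit image points and unit gradient directions (n tangent at r)
    R    : (N,) depths; t, omega : (3,) translation and rotation
    """
    u = (-t + r * (r @ t)[:, None]) / R[:, None] - np.cross(omega, r)
    return np.sum(n * u, axis=1)[:, None] * n

def votes(candidates, r, u_n):
    """V[c] = number of features i with u_i^n . (r_i - c) > 0, for each candidate c."""
    s = np.sum(u_n * r, axis=1)[None, :] - candidates @ u_n.T
    return np.sum(s > 0, axis=1)
```

With zero rotation, the candidate equal to the true direction of translation receives a vote from every feature whose normal flow is nonzero, which is the behavior described in the paragraph above.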

In the sequel we investigate the performance of the voting scheme in the presence of rotation. In particular we ask how large the area S' is when rotation is present. It will be shown that this depends on the angle $\theta_\omega$ between the direction of translation and the axis of rotation, as well as on the rotation-to-translation ratio $k$. In particular, $\theta_\omega$ distorts area S' and $k$ enlarges it as it grows. The rest of the paper quantifies this interaction.

Before  we  proceed  with  the  analysis  we  introduce 
a  natural  coordinate  system  that  greatly  simplifies  the 
calculations. 


5.2  A  Natural  Coordinate  System 

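Since spherical projection is symmetrical we can choose a coordinate system that facilitates our analysis. We define a new orthonormal coordinate system with unit vectors $\vec{\sigma}_x = \vec{i}$, $\vec{\sigma}_y = \vec{j}$, $\vec{\sigma}_z = \vec{k}$ (defined by Figure 10):

$$\vec{\sigma}_z = \frac{\vec{t}}{\|\vec{t}\|},$$

$$\vec{\sigma}_x = \begin{cases} \dfrac{\vec{\omega} \times \vec{t}}{\|\vec{\omega} \times \vec{t}\|} & \text{if } \vec{\omega} \times \vec{t} \neq 0 \\ \text{any unit vector such that } \vec{\sigma}_x \cdot \vec{\sigma}_z = 0 & \text{otherwise,} \end{cases}$$

and

$$\vec{\sigma}_y = \vec{\sigma}_z \times \vec{\sigma}_x.$$

Fig. 10.

So,

$\vec{t} = (0, 0, W) := W\vec{k}$ for some $W$,

$\vec{\omega} = (0, B, C) := B\vec{j} + C\vec{k}$ for some $B$ and $C$.

In spherical coordinates we get

$$\vec{t} = \|\vec{t}\|\,(0, 0, 1)$$
$$\vec{\omega} = \|\vec{\omega}\|\,(0, \sin\theta_\omega, \cos\theta_\omega)$$
$$\vec{r} = (\cos\varphi_r \sin\theta_r,\ \sin\varphi_r \sin\theta_r,\ \cos\theta_r)$$

Similarly, we define a coordinate system $(\xi, \eta)$ which lies on the plane tangent to the sphere at point $\vec{r} = (x, y, z)$. This tangent plane is spanned by the vectors

$$\vec{n}_\xi = \frac{\vec{u}_t}{\|\vec{u}_t\|} \quad \text{and} \quad \vec{n}_\eta = \frac{\vec{k} \times \vec{r}}{\|\vec{k} \times \vec{r}\|}, \quad \text{with } \vec{n}_\xi \cdot \vec{n}_\eta = 0.$$

Any flow vector lies on the tangent plane; therefore it will be a linear combination of vectors $\vec{n}_\xi$ and $\vec{n}_\eta$.

Now we are ready to express normal flow in the new coordinate system. Consider a feature with gradient direction

$$\vec{n} = \vec{n}_\xi \cos\alpha + \vec{n}_\eta \sin\alpha.$$

The translational normal flow is

$$\vec{n} \cdot \vec{u}_t = (\vec{n}_\xi \cos\alpha + \vec{n}_\eta \sin\alpha) \cdot \vec{u}_t = \|\vec{u}_t\| \cos\alpha.$$

Also,

$$\|\vec{u}_t\| = \frac{1}{R}\sqrt{\|\vec{t}\|^2 - (\vec{t} \cdot \vec{r})^2} = \frac{\|\vec{t}\|}{R}\sin\theta_r$$

and

$$\vec{n} \cdot \vec{u}_t = \frac{\|\vec{t}\|}{R}\sin\theta_r \cos\alpha.$$

For rotational normal flow we get

$$\vec{n} \cdot \vec{u}_R = -(\vec{\omega} \times \vec{r}) \cdot \vec{n}_\xi \cos\alpha - (\vec{\omega} \times \vec{r}) \cdot \vec{n}_\eta \sin\alpha.$$

Also,

$$-(\vec{\omega} \times \vec{r}) \cdot \vec{n}_\xi = -\frac{(\vec{\omega} \times \vec{r}) \cdot \vec{u}_t}{\|\vec{u}_t\|} = -\frac{(\vec{\omega} \times \vec{r}) \cdot \frac{1}{R}\left(-\vec{t} + \vec{r}\,(\vec{t} \cdot \vec{r})\right)}{\|\vec{u}_t\|} = -\|\vec{\omega}\| \sin\theta_\omega \cos\varphi_r$$

and

$$-(\vec{\omega} \times \vec{r}) \cdot \vec{n}_\eta = -\frac{(\vec{\omega} \times \vec{r}) \cdot (\vec{k} \times \vec{r})}{\|\vec{k} \times \vec{r}\|} = -\frac{\sin\theta_r\,\|\vec{\omega}\|\,(\cos\theta_\omega \sin\theta_r - \sin\theta_\omega \cos\theta_r \sin\varphi_r)}{\sin\theta_r}$$

so that

$$\vec{n} \cdot \vec{u}_R = \|\vec{\omega}\|\left[-\cos\theta_\omega \sin\theta_r \sin\alpha + \sin\theta_\omega \cos\theta_r \sin\varphi_r \sin\alpha - \sin\theta_\omega \cos\varphi_r \cos\alpha\right]$$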
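The closed-form expressions for the translational and rotational normal flow can be checked against the vector definitions. The sketch below is only a numerical sanity check; the function names and the parameterization by scalar magnitudes (`t_norm`, `w`) are choices made for this illustration.

```python
import numpy as np

def rotational_normal_flow(w, theta_w, theta_r, phi_r, alpha):
    """Closed form of n . u_R in the natural coordinate system (w = ||omega||)."""
    return w * (-np.cos(theta_w) * np.sin(theta_r) * np.sin(alpha)
                + np.sin(theta_w) * np.cos(theta_r) * np.sin(phi_r) * np.sin(alpha)
                - np.sin(theta_w) * np.cos(phi_r) * np.cos(alpha))

def direct_normal_flows(t_norm, R, w, theta_w, theta_r, phi_r, alpha):
    """Evaluate (n . u_t, n . u_R) from the vector definitions, with f = 1."""
    k = np.array([0.0, 0.0, 1.0])
    t = t_norm * k                                          # translation along the z axis
    omega = w * np.array([0.0, np.sin(theta_w), np.cos(theta_w)])
    r = np.array([np.cos(phi_r) * np.sin(theta_r),
                  np.sin(phi_r) * np.sin(theta_r),
                  np.cos(theta_r)])
    u_t = (-t + r * np.dot(t, r)) / R
    n_xi = u_t / np.linalg.norm(u_t)                        # along the translational flow
    n_eta = np.cross(k, r) / np.linalg.norm(np.cross(k, r)) # orthogonal tangent direction
    n = n_xi * np.cos(alpha) + n_eta * np.sin(alpha)
    return np.dot(n, u_t), np.dot(n, -np.cross(omega, r))
```

The direct evaluation is undefined at the poles ($\sin\theta_r = 0$), where the tangent basis degenerates, so the check should be run away from $\theta_r = 0$ and $\theta_r = \pi$.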


5.3  Correctness  of  Voting  in  the  Presence  of  Rotation 

The normal flow (as well as actual flow) is very small in the region close to the FOE, and in the directions close to orthogonal to the directions of the flow. Consequently, even when only translation is present, in order to avoid inaccuracies that might arise in the estimated direction of the normal flow (numerical manipulation of very small quantities is unstable), we are going to discard any normal flow whose magnitude is less than some threshold $T_t$. Later, it will turn out that choosing this threshold greatly facilitates the geometrical analysis of the technique. Considering an actual flow $\vec{u}$ at a point A (see Figure 11) we can compute the locus of gradient directions $\vec{n}$ along which the normal flow (i.e. the projection of $\vec{u}$ on $\vec{n}$) is bigger than the threshold $T_t$. In Figure 11 they are all directions inside angle BAC defined by

$$\psi = \arccos\frac{T_t}{\|\vec{u}\|} \quad \text{for } \frac{T_t}{\|\vec{u}\|} \le 1,$$

or there are no such directions for $\frac{T_t}{\|\vec{u}\|} > 1$.

We now develop a condition that needs to be satisfied in order for voting at a point to be correct in the presence of rotation.

Voting will clearly be correct only if the direction of the translational normal flow is the same as the direction of the actual normal flow, that is when

$$(\vec{n} \cdot \vec{u}_t)(\vec{n} \cdot \vec{u}) > 0 \qquad (2)$$

In addition, since we consider only normal flows greater than threshold, we need

$$|\vec{n} \cdot \vec{u}| > T_t \qquad (3)$$

Inequality (2) becomes

$$(\vec{n} \cdot \vec{u}_t)(\vec{n} \cdot \vec{u}) = (\vec{n} \cdot \vec{u}_t)(\vec{n} \cdot \vec{u}_t + \vec{n} \cdot \vec{u}_R) = (\vec{n} \cdot \vec{u}_t)^2 + (\vec{n} \cdot \vec{u}_t)(\vec{n} \cdot \vec{u}_R) > 0 \qquad (4)$$

So, if we set $|\vec{n} \cdot \vec{u}_R| = T_t$, then there are two possibilities: either $|\vec{n} \cdot \vec{u}|$ is below the threshold, in which case it is of no interest to voting, or the sign of $\vec{n} \cdot \vec{u}$ is the same as the sign of $\vec{n} \cdot \vec{u}_t$. In other words, if we can set the threshold equal to the maximum value of the normal rotational flow, then our voting will always be correct. But at point $\vec{r}$ of the sphere the rotational flow is

$$|\vec{n} \cdot \vec{u}_R| \le \|\vec{n}\| \cdot \|\vec{u}_R\| = \|\vec{u}_R\| = \|\vec{\omega} \times \vec{r}\| = \|\vec{\omega}\| \cdot \|\vec{r}\| \cdot |\sin(\angle\vec{\omega}, \vec{r})| \le \|\vec{\omega}\|$$

Thus if we choose $T_t = \|\vec{\omega}\|$, then the sign of $\vec{n} \cdot \vec{u}$ (actual normal flow) is equal to the sign of $\vec{u}_t \cdot \vec{n}$ (translational normal flow) for any normal flow of magnitude greater than $T_t$.

5.4 The Geometry of the Solution Area

The introduction of the threshold $T_t$ into our analysis has a beneficial side effect, since this constrains the possible gradient directions at every point where we can vote. As a consequence we can estimate the size of the smallest possible area that we might find as a solution. We need to caution the reader that in the case of pure translation the solution area (the area containing the FOE) contains the uncertainty area (the area where the values of the normal flow do not allow voting to be performed). However, when rotation is present then the solution area, in general, does not contain the uncertainty area. The reason for this is that points far away from the position of the FOE might constrain the solution area more than the uncertainty area does. The following two sections quantify this analysis.
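The thresholding argument of Section 5.3 (with $T_t = \|\vec{\omega}\|$, every retained normal flow has the sign of its translational part) can be checked numerically. The sketch below assumes a unit spherical retina and randomly generated features; the function name `sign_agreement` is hypothetical.

```python
import numpy as np

def sign_agreement(r, n, t, omega, R):
    """Compare sign(n.u) with sign(n.u_t) for measurements with |n.u| > ||omega||.

    r, n : (N, 3) unit image points and unit gradient directions (n tangent at r)
    R    : (N,) depths; t, omega : (3,) translation and rotation
    Returns (kept, agree): the mask of retained measurements and, for those,
    whether the actual and translational normal flows have the same sign.
    """
    u_t = (-t + r * (r @ t)[:, None]) / R[:, None]
    u_r = -np.cross(omega, r)
    nu_t = np.sum(n * u_t, axis=1)            # translational normal flow
    nu = nu_t + np.sum(n * u_r, axis=1)       # actual normal flow
    kept = np.abs(nu) > np.linalg.norm(omega) # threshold T_t = ||omega||
    return kept, np.sign(nu[kept]) == np.sign(nu_t[kept])
```

Measurements near the FOE, or with gradients nearly orthogonal to the flow, fall below the threshold and are discarded; all remaining ones agree in sign, which is what makes the votes correct.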


The Case of Translation. Depending on the threshold $T_t$, there is a point closest to the FOE for which a feature (normal flow) can be registered. It is obtained when

$$\frac{\|\vec{t}\|}{R}\sin\theta_r \cos\alpha = T_t \qquad (5)$$

It is obvious that for $\theta_r$ smaller than some threshold $\theta_{r_0}$, (5) never holds; for $\theta_r = \theta_{r_0}$, (5) holds only for $\alpha = 0$; when $\theta_r$ grows, this cone of directions for $\vec{n}$ (gradient), or range of $\alpha$'s, grows, and for $\theta_r = \frac{\pi}{2}$ it reaches $|\alpha| \in [0, \frac{\pi}{2} - \theta_{r_0}]$ if the $R$'s are the same at both points.

This increasing range of $\alpha$'s for increasing $\theta_r$ can be viewed as an increasing density of features (features are registered only for $\alpha$'s in the cone). But it is easy to show that feature points $\vec{r}_i$ with increasing $\theta$ will vote for some points with $\theta > \theta_{r_0}$ and will vote for some points with $\theta < \theta_{r_0}$. This happens when for $\alpha > 0$ the FOE constraint line gets slanted. The effect is shown in Figure 12.


Fig. 12.


Using spherical trigonometry (the law of sines) (Korn and Korn 1968) for triangle AB(FOE), we get

$$\sin\gamma = \sin\beta \sin\theta_r \qquad (6)$$

Since $\vec{n}$ is normal to great circle $\varepsilon$, we have $\alpha + \beta = \frac{\pi}{2}$ and (6) becomes

$$\sin\gamma = \cos\alpha \sin\theta_r \qquad (7)$$

The normal translational flow is given by

$$\vec{u}_t \cdot \vec{n} = \frac{\|\vec{t}\|}{R}\sin\theta_r \cos\alpha$$

And so, setting $|\vec{u}_t \cdot \vec{n}| = T_t$ and using (7), we find the most restrictive voting, i.e. the smallest possible $\gamma$:

$$\sin\gamma = \frac{T_t R}{\|\vec{t}\|} = \sin\theta_{r_0} \qquad (8)$$

We prove here that the voting function $V[\vec{r}]$ (introduced in Section 5.1) is non-decreasing on any great circle, as we move from the south to the north pole, where the FOE is assumed to be. To remind the reader, voting at a feature point $\vec{r}_i$ increases by one the votes of every point in the northern hemisphere (see Figure 13) defined by the great circle normal to the gradient $\vec{n}$ at the feature point. Consider a great circle SA(FOE). All points on the arc SA receive zero votes, while each point on A(FOE) receives one vote. Consequently, since each voting process can only increase the votes, the number of votes is non-decreasing as we move closer to the FOE. This simply means that the solution area (i.e. S') on the sphere will always be closed and contain the FOE. Its size, however, could be large if the distribution of features is not favorable. In addition, if voting is done only on a surface patch S (a limited visual field), S' could be open.


The Case of Nonzero Rotation. In the presence of rotation, voting will still yield a closed area on the sphere, which can in general be larger. Here we study properties of the shape and size of the solution area. Due to the use of threshold $T_t = \|\vec{\omega}\|$, points in an area around the FOE will not be used for voting. The size and shape of the solution area will depend on the angle $\theta_\omega$ between $\vec{t}$ and $\vec{\omega}$ and the threshold $T_t = \|\vec{\omega}\|$. We first consider the case where $\vec{t}$ and $\vec{\omega}$ are parallel. Then the normal flow is given by (see Section 5.2)

$$\vec{n} \cdot \vec{u} = \sin\theta_r\left(\frac{\|\vec{t}\|}{R}\cos\alpha - \|\vec{\omega}\|\sin\alpha\right)$$

Now we find the angular distance $\theta_{r_0}$ between the FOE and the closest point to it on the sphere that can vote, i.e. the closest point with normal flow equal to $T_t = \|\vec{\omega}\|$. It is clear that the point closest to the FOE that can vote is the one at which the maximum possible normal flow is equal to $T_t$. Maximum normal flow is obtained when the direction of the gradient is the same as the direction of the actual flow. That happens for the angle $\alpha$ for which

$$\vec{n} \cdot \vec{u} = \sin\theta_r\left(\frac{\|\vec{t}\|}{R}\cos\alpha - \|\vec{\omega}\|\sin\alpha\right)$$

is maximized, i.e. for

$$\alpha_0 = -\arctan\left(\frac{\|\vec{\omega}\| R}{\|\vec{t}\|}\right)$$

Thus, the angular distance obeys

$$\vec{n} \cdot \vec{u} = \sin\theta_{r_0}\left(\frac{\|\vec{t}\|}{R}\cos\alpha_0 - \|\vec{\omega}\|\sin\alpha_0\right) = \|\vec{\omega}\|, \qquad \text{i.e.} \qquad \tan\theta_{r_0} = \frac{\|\vec{\omega}\| R}{\|\vec{t}\|}$$

Fig. 13.
As  was  said  before,  if  the  uncertainty  area  is  not 
contained  in  the  image,  then  the  solution  area  will  not 
be  closed;  otherwise  things  depend  on  the  distribution 
of  features,  and  any  feature  point  might  further  con¬ 
strain  the  solution  area  more,  as  was  shown  in  Figure 
12. 
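For the parallel case, the angular distance to the closest point that can vote follows from maximizing the normal flow over the gradient angle. The sketch below works this out numerically under the assumptions of that case ($\vec{t}$ parallel to $\vec{\omega}$, threshold equal to $\|\vec{\omega}\|$); the function names are chosen for this illustration only.

```python
import numpy as np

def best_gradient_angle(t_norm, R, w):
    """alpha_0 maximizing (||t||/R) cos(alpha) - ||omega|| sin(alpha)."""
    return -np.arctan(w * R / t_norm)

def max_normal_flow(theta_r, t_norm, R, w):
    """Maximum over alpha of sin(theta_r) * ((||t||/R) cos(alpha) - ||omega|| sin(alpha))."""
    return np.sin(theta_r) * np.hypot(t_norm / R, w)

def closest_voting_angle(t_norm, R, w):
    """theta_r0 at which the maximum normal flow equals the threshold ||omega||."""
    return np.arctan(w * R / t_norm)
```

At the returned angle the maximum normal flow equals $\|\vec{\omega}\|$ exactly, so any point closer to the FOE stays below the threshold for every gradient direction and cannot vote.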


The  rest  of  this  section  describes  various  proper¬ 
ties  of  the  solution  area  as  the  relevant  parameters 
vary. 

If  Or  =  6r^  then  a  =  Or,,  (Figure  12)  and  the  con¬ 
straint  line  (circle)  will  come  as  close  as  y  to  the 
FOE  where 


sin  y  =  sin  dr^  sin  P  = 

=  sin 6^0  sin(  j  —  a)  =  j  sin  20,0 

If.  however,  a  ^  6r^  (we  might  even  have  a  =  j), 
as  the  cone  of  normal  flows  around  the  flow  grows 
(Figure  1 1 ),  this  constrains  the  area  around  the  north 
pole  (FOE)  even  more.  Unlike  the  case  of  transla¬ 
tion  only,  voting  further  restricts  the  FOE  when  fea¬ 
ture  points  are  moving  away  from  it  and  the  area  for 
which  voting  is  maximum  becomes  smaller  than  the 
uncertainty  area.  On  the  other  hand,  if  w  x  r  0  but 
u)  =  (0,  B,  C),  things  become  unsymmetrical  around 
the  FOE.  Using  the  already-defined  coordinate  sys¬ 
tem  Oxyz,  o)  is  defined  by  ||w|(,  =  y  (the  analog 

of  tpr — see  Figure  10),  and  Ba,. 

When  the  angle  becomes  greater  than  zero  and 
acquires  a  small  value,  a  subtle  change  in  the  uncer¬ 
tainty  area  occurs.  The  flow  values  at  the  points  on 
the  borda  of  that  area  increase  or  decrease  depend¬ 
ing  on  their  positions.  Since  flow  is  continuous  in 
the  point  or  points  for  which  HhU  =  will  stay 
close  to  the  border.  The  effect  is  that  the  borderiine 


||u||  ==  ||(i)||  will  change  its  shape  in  the  same  way.  If 
the  flow  increases,  the  border  of  the  uncertainty  area 
will  shrink  closer  to  the  north  pole  (FOE)  and  if  flow 
decreases,  it  will  stretch  the  border  away  from  the 
north  pole 

It happens that with growth of $\theta_\omega$ and $\|\omega\|$ this area stretches away in direction $\omega \times t$ and shrinks in the opposite direction. The exact shape of the area for a given $\omega$, $t$ and $R = R(\varphi, \theta)$ can be computed numerically (the border is defined by $\|u\| = \|u_t + u_\omega\| = \|\omega\|$). The effect of the change in shape of the area is as if the FOE moved in the $x$-$z$ plane ($\varphi_r = 0$). Again, if the area were not completely closed (was intersected by the image patch $S$), the solution area (with maximum voting) would not be closed. If the area were in the image, feature points outside it would further constrain the area which contains the FOE because of the slant of the FOE constraint lines in each feature point. Figure 14 demonstrates graphically the evolution of the uncertainty area for different values of $\theta_\omega$ and $k$. Each figure is produced by projecting every point of the northern hemisphere on the plane tangent to the north pole (FOE) with the south pole as the center of projection (stereographic projection). Figure 14(a) shows the evolution of the uncertainty area for $k = 0.25$ and $\theta_\omega$ between 0 and $\pi/2$, with the uncertainty area centered for $\theta_\omega = 0$ and completely offset for $\theta_\omega = \pi/2$ (with all other values in between). Similarly, Figures 14(b) and 14(c) show the same results for $k = 0.5$ and $0.75$, respectively.
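The stereographic projection used to draw these figures can be sketched in a few lines. The fragment below is only an illustration, assuming a unit sphere with the north pole (FOE) at (0, 0, 1), the south pole (0, 0, -1) as the center of projection and the tangent plane z = 1; the function names are hypothetical and do not come from the paper.

```python
import math

def sphere_point(theta, phi):
    """Unit vector at angular distance theta from the north pole and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def stereographic(x, y, z):
    """Project a unit-sphere point onto the plane z = 1 from the south pole (0, 0, -1)."""
    if z <= -1.0:
        raise ValueError("the south pole has no image on the tangent plane")
    # The ray from the south pole through (x, y, z) reaches z = 1 at scale 2 / (1 + z).
    s = 2.0 / (1.0 + z)
    return s * x, s * y

# The north pole maps to the plane origin and the equator to a circle of radius 2.
pole_image = stereographic(0.0, 0.0, 1.0)
equator_image = stereographic(1.0, 0.0, 0.0)
```

Under these assumptions a point at angular distance $\theta$ from the pole lands at distance $2\tan(\theta/2)$ from the plane origin, so the whole northern hemisphere fits inside a disc of radius 2.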


The Case of Dominant Rotation. Although the technique described in this paper was derived to solve the problem of kinetic stabilization, it turns out that it has general applicability. It can be modified to handle the case of dominant rotation with translation.

For the case of pure rotation and a spherical retina the optical flow will correspond to vectors tangent to the circles around the axis of rotation $\omega$. The point at which the axis of rotation passes through the image will be called the AOR. If there is circular optical flow in the image (due to pure rotation) the center of all the circles is the AOR. If we take an arbitrary optical flow vector $u_\omega$ at the point $r_i$ then we can say that a point $r$ is a candidate for the AOR if

$$(r_i \times u_\omega) \cdot r < 0.$$

This inequality expresses the fact that the feature point and the flow vector at the point span the plane $p$ which cuts the sphere in two hemispheres, where one contains all possible candidate points for the AOR (and all of them satisfy the previous inequality). Furthermore, all possible positions of the AOR lie on the great circle which is normal (on the sphere) to the great circle which is the intersection of the plane $p$ and the image sphere. In other words, if we replace $u_\omega$ with the normal flow $u_\omega^n$ the inequality will still hold.

Fig. 14. (a) The evolution of the uncertainty area for $k = 0.25$ and $\theta_\omega$ between 0 and $\pi/2$, with the uncertainty area centered for $\theta_\omega = 0$ and completely offset for $\theta_\omega = \pi/2$ (with all other values in between). (b) The same results for $k = 0.5$. (c) The same results for $k = 0.75$.

Very similar reasoning applies in the case of a flat retina (perspective projection). Given an optical flow $(u, v)$ at the feature point $(x_i, y_i)$, all possible candidate points for the AOR are on the right of the line passing through $(x_i, y_i)$ and parallel to $(u, v)$. Furthermore, they all lie on the line normal to $(u, v)$ and originating at $(x_i, y_i)$. In other words, candidate points $(x, y)$ for the AOR satisfy the inequality

$$\big((u, v, 0) \times (x - x_i,\; y - y_i,\; 0)\big) \cdot (0, 0, 1) < 0.$$

This inequality indicates that the $z$ component of the vector product of the optical flow vector and the difference of the candidate AOR point and the feature point must be negative. As in the case of a spherical retina this holds even when the optical flow $(u, v)$ is replaced by the normal flow $(u^n, v^n)$. As was done in the case of translation, voting can be performed. Points with maximum votes are candidates for the AOR. If a minimum is sought then the opposite direction will be found. If the area is closed then the AOR is localized as before; otherwise its general direction will be indicated by the area with maximum votes.
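The half-plane voting for the AOR on a flat retina can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a pure rotation about the optical axis with image flow $(u, v) = (Cy, -Cx)$, a sign convention chosen here so that the true AOR satisfies the inequality above, and it draws the gradient orientations at random. All function and parameter names are hypothetical.

```python
import math
import random

def vote_aor(measurements, candidates):
    """For each candidate (x, y), count normal flows whose half-plane contains it."""
    votes = []
    for (x, y) in candidates:
        count = 0
        for (xi, yi, un, vn) in measurements:
            # z component of (un, vn, 0) x (x - xi, y - yi, 0) must be negative.
            if un * (y - yi) - vn * (x - xi) < 0.0:
                count += 1
        votes.append(count)
    return votes

def rotation_normal_flow(n_points, c=0.01, half_width=1.0, seed=0):
    """Synthetic normal flow for rotation about the optical axis (assumed model)."""
    rng = random.Random(seed)
    measurements = []
    for _ in range(n_points):
        xi = rng.uniform(-half_width, half_width)
        yi = rng.uniform(-half_width, half_width)
        u, v = c * yi, -c * xi               # full flow, circling the image center
        a = rng.uniform(0.0, math.pi)        # random gradient orientation
        nx, ny = math.cos(a), math.sin(a)
        s = u * nx + v * ny                  # signed normal-flow magnitude
        measurements.append((xi, yi, s * nx, s * ny))
    return measurements

measurements = rotation_normal_flow(200)
grid = [(i * 0.25, j * 0.25) for i in range(-4, 5) for j in range(-4, 5)]
votes = vote_aor(measurements, grid)
best = grid[votes.index(max(votes))]         # a candidate with the maximum number of votes
```

Only the direction of each normal flow enters the test, so the vote is insensitive to errors in the flow magnitude.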

An analysis (on a spherical retina) similar to the one performed for the case of dominant translation can be performed again. This time, however, the threshold should be set to T, = r = ll/ll. If the magnitude of the normal flow is greater than T, then it must have the same sign (and direction) as rotational normal flow.

When $\omega$ and $t$ are parallel the angular radius of
the  uncertainty  region  is  equal  to  where  cot 

The difference in the angular radii of the uncertainty areas around the FOE and the AOR is that the tangent is replaced by the cotangent. When $\theta_\omega > 0$ the uncertainty area around the AOR changes shape in a similar manner as the uncertainty area around the FOE. It extends in the direction $\omega \times t$ with the growth of $\theta_\omega$ and gets closer to the AOR in the opposite direction.

6  Experimental  Results 

We  have  performed  several  experiments  with  both 
synthetic  and  real  image  sequences  in  order  to  demon¬ 
strate  the  stability  of  the  method.  From  experiments 
on  real  images  it  was  found  that  in  the  case  of  pure 
translation  or  pure  rotation  the  method  computes  the 
Focus  of  Expansion  or  the  Axis  of  Rotation  very  ac¬ 
curately.  In  the  case  of  general  motion  it  was  found 
from  experiments  on  synthetic  data  that  the  behav¬ 
ior  of  the  method  is  as  predicted  by  our  theoretical 
analysis. 

6.1 Synthetic Data

We considered a set of features at random depths (uniformly distributed in a range $R_{\min}$ to $R_{\max}$). The scene was imaged using a spherical retina as in Figure 15. Optic flow and normal optic flow were computed on the sphere and then projected onto the tangent plane (see Figure 15). Normal flow was computed by considering features whose orientations were produced using a uniform distribution.
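A synthetic data set of this kind could be generated along the following lines. The sketch assumes the common spherical motion-field model $u = ((t \cdot r)\,r - t)/R - \omega \times r$ for a unit viewing direction $r$ at depth $R$; this sign convention, the way the patch is sampled, the rotation vector used in the example and all names are assumptions made for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sphere_flow(r, depth, t, w):
    """Assumed motion field at unit direction r: ((t.r) r - t) / depth - w x r."""
    tr = dot(t, r)
    wr = cross(w, r)
    return tuple((tr * r[i] - t[i]) / depth - wr[i] for i in range(3))

def sample_flows(n, t, w, r_min, r_max, fov_deg, seed=0):
    """Return (r, flow, normal_flow) triples on a patch centered on the +z axis."""
    rng = random.Random(seed)
    half = math.radians(fov_deg) / 2.0
    samples = []
    for _ in range(n):
        th = rng.uniform(0.0, half)             # angular distance from the patch center
        ph = rng.uniform(0.0, 2.0 * math.pi)    # azimuth
        r = (math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th))
        depth = rng.uniform(r_min, r_max)       # depth uniform in [r_min, r_max]
        u = sphere_flow(r, depth, t, w)
        # Orthonormal tangent basis at r, then a uniformly drawn gradient orientation.
        e_th = (math.cos(th) * math.cos(ph), math.cos(th) * math.sin(ph), -math.sin(th))
        e_ph = (-math.sin(ph), math.cos(ph), 0.0)
        a = rng.uniform(0.0, math.pi)
        n_dir = tuple(math.cos(a) * e_th[i] + math.sin(a) * e_ph[i] for i in range(3))
        s = dot(u, n_dir)                       # signed normal-flow magnitude
        samples.append((r, u, tuple(s * c for c in n_dir)))
    return samples

# Translation along z with a hypothetical small rotation in the y-z plane.
samples = sample_flows(100, (0.0, 0.0, 1.0), (0.0, 0.005, 0.01), 10.0, 20.0, 56.0)
```

Each flow vector is tangent to the sphere at its point, and each normal flow is the projection of that vector onto the randomly drawn gradient direction, so its magnitude never exceeds that of the full flow.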


Fig. 15. Sphere $OXYZ$ represents a spherical retina (frame $OXYZ$ is the frame of the observer). The translation vector $t$ is along the $z$ axis and the rotation axis lies on the plane $OZY$. Although a spherical retina is used here, information is used only from a patch of the sphere defined by the solid angle FOV containing the viewing direction (defined by the two angles $\theta$ and $\varphi$; see Figure 10). The spherical image patch is projected stereographically with center $S'$ on the plane $P$ tangent to the sphere at $N'$, and having a natural coordinate system $(\xi, \eta)$. All results (solution areas, voting functions, actual and normal flow fields) are projected and shown on the tangential plane.


Figures 16 to 20 show one set of experiments. Figure 16 shows the optic flow field for $\theta_\omega = 0°$, viewing angles $(\theta, \varphi) = (0°, 0°)$, $R_{\min} = 10$ and $R_{\max} = 20$ in units of focal length, $\|t\| = 1$, $k = 0.1$ and FOV = 56°. Figure 17 shows the corresponding normal flow. Similarly, Figures 18 and 19 show optical and normal flow fields for the same conditions as before with the exception that $k = 0.75$, which is obtained by growing $\|\omega\|$. Under the above viewing conditions, the FOE as well as the AOR is in the center of the image. Figure 20 shows results of voting for determining the FOE. In the first column thresholding precedes voting, with $T = \|\omega\|$, and in the second column there is no thresholding. In the first row, only the area with the maximum number of votes is shown, while in the second row the whole voting function is displayed (black is maximum). Clearly, the solution is a closed area (except for the biggest $k$) whose size grows with $k$.

Fig. 17.

Fig. 19.

Fig. 20. Panel labels: Thresholding, No Thresholding; $k = 0.75$, $0.5$, $0.25$, $0.1$.
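The difference between voting with and without thresholding can be illustrated with a toy example on the sphere. The sketch below is not the authors' code. It assumes that translational flow points away from the FOE, so that a normal flow $u_n$ dominated by translation satisfies $u_n \cdot f < 0$ for the true FOE direction $f$. The measurements, the threshold value and the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def vote_foe(normal_flows, candidates, threshold=0.0):
    """Votes per candidate direction f: retained normal flows with u_n . f < 0."""
    kept = [un for un in normal_flows if math.sqrt(dot(un, un)) > threshold]
    votes = [sum(1 for un in kept if dot(un, f) < 0.0) for f in candidates]
    return votes, len(kept)

def away_from_pole(theta, phi, magnitude):
    """Tangent vector at (theta, phi) pointing away from the north pole."""
    return (magnitude * math.cos(theta) * math.cos(phi),
            magnitude * math.cos(theta) * math.sin(phi),
            -magnitude * math.sin(theta))

# Hypothetical measurements: four strong flows expanding from the pole (FOE)
# and one weak, rotation-dominated flow pointing the wrong way.
flows = [away_from_pole(0.3, k * math.pi / 2.0, 0.03) for k in range(4)]
flows.append(away_from_pole(0.3, 0.0, -0.005))

# Candidates: the true FOE direction and a direction 90 degrees away from it.
candidates = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
votes_raw, kept_raw = vote_foe(flows, candidates)                   # no thresholding
votes_thr, kept_thr = vote_foe(flows, candidates, threshold=0.01)   # e.g. the norm of omega
```

In this toy case the true FOE is supported by four of the five measurements without thresholding and by all four retained measurements with it, because the weak measurement whose sign is set by rotation is discarded before the vote.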

Figures 21 to 25 show the second set of experiments. The only change from the first set is that FOV = 106°. The FOE is in the center of the image. The uncertainty is quite small due to the bigger field of view. This was predicted in our analysis.

Finally, Figures 26 to 30 show the third set of experiments. The change from the first set is that $\theta_\omega = 45°$. As was predicted, the solution area gets distorted and for bigger $\|\omega\|$ it becomes open. In the case when there is no thresholding before voting (second row) this appears as a shift in the estimated position of the FOE.


Fig. 22. Fig. 30. Column labels: Thresholding, No Thresholding.


6.2 Real Data


Figure 31 shows one of the images from a dense sequence collected in our laboratory using an American Merlin Robot Arm carrying a miniature CCD Sony T.V. camera and translating along the camera's optical axis (Figure 32). Figure 33 shows the last frame of the sequence and Figure 34 shows the normal flow field obtained. Finally, Figure 35 shows the first frame with the solution area (where the FOE lies), which agrees with the ground truth.

Figure 36 shows the first from a series of images acquired and made public for the IEEE 1991 Workshop on Motion by NASA Ames Research Center. The camera is moving forward (FOE = (232, 240), which in our images (4 times reduced) is in the middle of the white area of the Coca-Cola can). Figure 37 shows a normal flow field acquired from this sequence and Figure 38 shows the solution area. Figure 39 shows the solution area, which contains the actual solution, superimposed on the first frame.

Figure 40 shows the first of a series of images collected by the University of Massachusetts at Amherst and made public for the IEEE 1991 Workshop on Motion. The camera was mounted on a robot arm. The upper arm of the robot (shoulder to elbow) is approximately along the viewing direction. The lower arm (elbow to gripper) is normal to the upper arm (90 deg.). The camera is traveling along the circle centered at the elbow and the axis of the camera is parallel to the upper arm. Since the scene is 5-10 m away, the effect is one of rotation about the axis parallel to the viewing direction and small translation normal to it (FOE at infinity, dominant rotation, $k$ approximately equal to 0.1). Figure 41 shows the last frame of the sequence. Figure 42 represents the normal flow estimated using frames 3, 4 and 5. Figure 43 shows the results of voting for the position of the AOR and Figure 44 shows the position of the AOR superimposed on the first frame.

The above experiments produced very good results (actual solution always inside the solution area) because there was either dominant translation or dominant rotation. The experiment below demonstrates noisy results for the FOE and the AOR because translation and rotation have about the same proportion on the image (see Note 10). Figure 45 shows the first from a series of images of a box rotating around a vertical axis passing through the middle of the top face, collected by the University of Massachusetts at Amherst and made public for the 1991 Workshop on Motion. The box is rotating around the shaft. To compare our algorithm's results with ground truth we need to understand the object's motion in a camera-centered coordinate system. Since there is a distance between the box and the camera of 600 mm, the visual motion produced by the box's rotation is equivalent to the motion produced by a general motion of the observer, consisting of both rotation and translation. The axis of rotation is parallel to the shaft pointing downward and the translation is along a circle centered at the shaft with radius 600 mm. The circle lies on a plane normal to the shaft and the direction of translation is to the left side (with the FOE at infinity). In this case rotation and translation are of about the same proportion ($k \approx 1$) and thus the results for both the FOE and the AOR are noisy. Figure 46 shows the normal flow field obtained from the first three images of the sequence, and Figures 47 and 48 display the solutions for the FOE and the AOR respectively, superimposed on the original image.

Fig. 31. Fig. 35. Fig. 38. Fig. 41.


7  Conclusions 

A technique was presented for computing the direction of motion of a moving observer using as input the normal flow field. In particular, for the actual computation only the direction of the normal flow is used. We showed theoretically that the method works robustly even when some amount of rotation is present, and we quantified the relationship between time-to-collision and magnitude of rotation that allows the method to work correctly. It has been shown that the position of the estimated FOE is displaced in the presence of rotation and this displacement has been explained. The practical significance of this research is that if we have at our disposal an inertial sensor whose error bounds are known, we can use the method described in this paper to obtain a machine vision system that can robustly compute the heading direction. However, if rotation is not large (see Note 11), then the method can still reliably compute the direction of motion, without using inertial sensor information. The technique cannot be used for determining the translation of a rigidly moving object, simply because the area on the image where voting could be performed is relatively small. See, for example, Figure 49, where an object is translating parallel to the optical axis (a), but the solution area is open (b) (in this case the FOE = (0,0)). Finally, the same analysis described here has been carried out for a different coordinate system (Duric et al., 1993).

Fig. 42. Fig. 43. Fig. 44. Fig. 45. Fig. 46. Fig. 48.


Notes


1. One can also differentiate a category of methods that use correspondence of macrofeatures (contours, lines, sets of points, etc.) (Aloimonos and Shulman 1989; Spetsakis and Aloimonos 1990), but we don't discuss them here, due to the lack of literature on the stability of such techniques.

2. As in photogrammetry, for example, for solving the problem of relative orientation (Horn 1990).

3. Since measurements are in focal length units, 1% error in displacements amounts to about 3-8 pixels for most commercially available cameras.

4. In the case of backward movement the situation is symmetric (maximum - minimum) and handled similarly.

5. If computation of normal flow at some points is unreliable, we just don't compute normal flow there (see Section 6).

6. In the sense of Figure 2; we assume that the observer usually looks towards where it is moving.

7. These two vectors are not defined at the FOE, i.e., at $r = t$, $u_t = 0$, $r \times t = 0$, but this is the only singular point.

8. This is true regardless of rotation, and it is a general geometric result.

9. The hemisphere that contains the FOE.

10. It is worth noting that the algorithm in (l?|) will produce accurate results in this case.

11. How large is large has been quantified in Section 5.

Fig. 49. Panels (a) and (b).